Shallow Discourse Parsing with Maximum Entropy Model
In recent years, increasing research has been devoted to the subtasks of
complete shallow discourse parsing, such as identifying discourse
connectives and the arguments of each connective. There is a need for a full
discourse parser that pulls these subtasks together, so we develop a
discourse parser that turns free text into discourse relations. The parser
consists of a connective identifier, an arguments identifier, a sense
classifier and a non-explicit identifier, connected to each other in a
pipeline. Each component applies a maximum entropy model with rich lexical
and syntactic features extracted from the Penn Discourse Treebank (PDTB). The
arguments identifier adopts the head-based representation of the PDTB, which
turns the problem of identifying the arguments of a discourse connective into
finding the head and end of each argument. In the non-explicit identifier,
contextual features such as high-frequency words that reflect the discourse
relation are introduced to improve its performance. Compared with other
methods, the parser achieves considerable performance in experiments.
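As a rough illustration (not the authors' code), the pipeline can be sketched
with multinomial logistic regression, which is equivalent to a maximum entropy
classifier; the feature function below is a stub standing in for the paper's
lexical and syntactic PDTB features:

    # Sketch only: one maximum entropy (multinomial logistic regression)
    # classifier per pipeline stage.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def make_maxent():
        return make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))

    connective_identifier  = make_maxent()  # token span -> discourse connective?
    arguments_identifier   = make_maxent()  # candidate node -> argument head/end?
    sense_classifier       = make_maxent()  # connective in context -> sense
    nonexplicit_identifier = make_maxent()  # adjacent sentence pair -> implicit sense

    def features(span):
        # Placeholder for connective strings, POS tags, syntactic paths, etc.
        return {"lower": span.lower(), "first": span.split()[0].lower()}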
Improving Social Media Text Summarization by Learning Sentence Weight Distribution
Recently, encoder-decoder models have been widely used in social media text
summarization. However, these models sometimes erroneously select noise words
in irrelevant sentences as part of a summary, which degrades performance. In
order to suppress irrelevant sentences and focus on key information, we
propose an effective approach that learns a sentence weight distribution. In
our model, we build a multi-layer perceptron to predict sentence weights.
During training, we use the ROUGE score as an estimate of each sentence's
weight, and minimize the gap between the estimated and the predicted weights.
In this way, we encourage our model to focus on the key sentences, which are
highly relevant to the summary. Experimental results show that our approach
outperforms baselines on a large-scale social media corpus.
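A minimal sketch of this idea, under assumed shapes and names rather than the
released model: a multi-layer perceptron maps sentence encodings to a weight
distribution, trained to match normalized per-sentence ROUGE scores:

    # Sketch in PyTorch; `sent_vecs` stands in for the summarizer's sentence
    # encodings and `rouge` for per-sentence ROUGE scores vs. the reference.
    import torch
    import torch.nn as nn

    class SentenceWeigher(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                     nn.Linear(dim, 1))

        def forward(self, sent_vecs):                  # (num_sents, dim)
            return torch.softmax(self.mlp(sent_vecs).squeeze(-1), dim=-1)

    weigher = SentenceWeigher(dim=256)
    sent_vecs = torch.randn(5, 256)                    # five toy sentence encodings
    rouge = torch.tensor([0.10, 0.70, 0.05, 0.10, 0.05])
    target = rouge / rouge.sum()                       # estimated weight distribution
    loss = nn.functional.mse_loss(weigher(sent_vecs), target)
    loss.backward()                                    # shrink estimated/predicted gap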
A memory mechanism based on two dimensional code of neurosome pattern
We have recognized that 2D codes, i.e., groups of strongly connected
neurosomes that can be excited simultaneously, are the basic data carriers of
memory in a brain. An echoing mechanism between two neighboring layers of
neurosomes is assumed to establish temporary memory, and repetition of this
process enhances the formation of long-term memory. The creation and
degradation of memory information are statistical processes. The maximum
capacity of memory storage in a human brain is estimated to be one billion 2D
codes. By triggering one or more neurosomes in a neurosome-based 2D code, the
whole strongly connected neurosome network can be excited simultaneously and
project its excitation onto an analysis layer of neurons in the cortex, thus
retrieving the stored memory data. The capability of comparing two 2D codes
in the analysis layer is one of the major brain functions.
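The retrieval step resembles classical associative-memory pattern completion;
the following toy, a standard Hopfield-style network and not from the paper,
illustrates how triggering a few units can re-excite a whole stored pattern:

    # Toy analogy only: Hebbian weights play the role of "strong connections",
    # and a partial cue retrieves the complete stored code.
    import numpy as np

    rng = np.random.default_rng(0)
    patterns = np.sign(rng.standard_normal((3, 64)))   # three stored +/-1 codes
    W = patterns.T @ patterns / 64.0                   # Hebbian connection weights
    np.fill_diagonal(W, 0)

    cue = patterns[0].copy()
    cue[16:] = 0                                       # trigger only a few units
    for _ in range(10):                                # echo until stable
        cue = np.sign(W @ cue)

    print(np.array_equal(cue, patterns[0]))            # typically True: code retrieved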
Layered structure and leveled function of a human brain
The anatomically layered structure of a human brain results in leveled
functions. At all these functional levels, comparison, feedback and imitation
are the universal and crucial mechanisms. Languages, symbols and tools play
key roles in the development of the human brain and of civilization as a
whole.
Learning Sentiment Memories for Sentiment Modification without Parallel Data
The task of sentiment modification requires reversing the sentiment of the
input while preserving the sentiment-independent content. However, aligned
sentences with the same content but different sentiments are usually
unavailable. Due to the lack of such parallel data, it is hard to extract
sentiment-independent content and reverse the sentiment in an unsupervised
way. Previous work usually cannot reconcile sentiment transformation and
content preservation. In this paper, motivated by the fact that the
non-emotional context (e.g., "staff") provides strong cues for the occurrence
of emotional words (e.g., "friendly"), we propose a novel method that
automatically extracts appropriate sentiment information from learned
sentiment memories according to the specific context. Experiments show that
our method substantially improves the degree of content preservation and
achieves state-of-the-art performance.
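A minimal sketch of reading a learned sentiment memory with a context query;
the module, sizes and attention form are assumptions for illustration, not
the authors' implementation:

    # Sketch only: the non-emotional context attends over a learned memory
    # of the target sentiment to retrieve context-appropriate sentiment info.
    import torch
    import torch.nn as nn

    class SentimentMemory(nn.Module):
        def __init__(self, slots, dim):
            super().__init__()
            self.memory = nn.Parameter(torch.randn(slots, dim))  # learned slots

        def forward(self, context_vec):                # (dim,) e.g. encodes "staff"
            attn = torch.softmax(self.memory @ context_vec, dim=0)
            return attn @ self.memory                  # sentiment info for decoder

    positive_memory = SentimentMemory(slots=100, dim=64)
    context = torch.randn(64)                          # sentiment-independent content
    sentiment_vec = positive_memory(context)           # guides the rewriting step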
A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text
Named Entity Recognition and Relation Extraction for Chinese literature text
are regarded as highly difficult problems, partly because of the lack of
annotated data. In this paper, we build a discourse-level dataset from
hundreds of Chinese literature articles to improve this task. To build a
high-quality dataset, we propose two tagging methods to resolve data
inconsistency: a heuristic tagging method and a machine-auxiliary tagging
method. Based on this corpus, we also run experiments with several widely
used models. The experimental results not only show the usefulness of the
proposed dataset, but also provide baselines for further research. The
dataset is available at
https://github.com/lancopku/Chinese-Literature-NER-RE-Datase
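One way to picture the machine-auxiliary idea (the helper below is
hypothetical, not the authors' tooling): a tagger trained on the heuristic
annotations can flag likely inconsistencies for human review:

    # Hypothetical helper: any sequence tagger with fit/predict works here.
    def machine_auxiliary_pass(sentences, heuristic_tags, tagger):
        tagger.fit(sentences, heuristic_tags)          # learn from heuristic labels
        to_review = []
        for sent, gold in zip(sentences, heuristic_tags):
            pred = tagger.predict(sent)
            if pred != gold:                           # disagreement suggests an
                to_review.append((sent, gold, pred))   # inconsistent annotation
        return to_review                               # hand back to annotators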
Minimal Effort Back Propagation for Convolutional Neural Networks
As traditional neural networks consume a significant amount of computing
resources during back propagation, \citet{Sun2017mePropSB} propose a simple
yet effective technique to alleviate this problem, in which only a small
subset of the full gradient is computed to update the model parameters. In
this paper we extend this technique to the Convolutional Neural Network (CNN)
to reduce the computation in back propagation, and the surprising results
verify its validity in CNNs: only 5\% of the gradients are passed back, yet
the model still achieves the same effect as the traditional CNN, or even
better. We also show that the top-$k$ selection of gradients leads to sparse
computation in back propagation, which may bring significant computational
benefits given the high computational complexity of the convolution operation
in CNNs.
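A minimal sketch of top-$k$ gradient selection as a custom autograd function;
the ratio and the way it is wrapped around a tensor are illustrative choices,
not the paper's exact implementation:

    # Identity in the forward pass; keep only the largest-magnitude gradient
    # entries in the backward pass, zeroing the rest.
    import torch

    class TopKGrad(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, ratio):
            ctx.ratio = ratio
            return x.clone()                           # identity forward

        @staticmethod
        def backward(ctx, grad_out):
            flat = grad_out.flatten()
            k = max(1, int(ctx.ratio * flat.numel()))  # e.g. keep 5% of entries
            idx = flat.abs().topk(k).indices
            sparse = torch.zeros_like(flat)
            sparse[idx] = flat[idx]                    # everything else is dropped
            return sparse.view_as(grad_out), None

    x = torch.randn(4, 10, requires_grad=True)
    y = TopKGrad.apply(x, 0.05)
    y.sum().backward()
    print((x.grad != 0).float().mean())                # ~0.05 of entries kept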
Surfaces pinched by normal curvature for mean curvature flow in space forms
In this paper, we investigate the mean curvature flow of compact surfaces in
-dimensional space forms. We prove convergence theorems for the mean
curvature flow under certain pinching conditions involving the normal
curvature, which generalise Baker and Nguyen's convergence theorem.
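For reference, the underlying evolution equation: a family of immersions
moves in the direction of its mean curvature vector. The paper's precise
normal-curvature pinching condition is not reproduced here.

    % Mean curvature flow of a compact surface \Sigma immersed in a space
    % form N, with immersions F_t = F(\cdot, t) : \Sigma \to N:
    \frac{\partial F}{\partial t}(p,t) = \vec{H}(p,t),
    \qquad F(\cdot,0) = F_0 .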
Horizontal and Vertical Ensemble with Deep Representation for Classification
Representation learning, especially representation learning with deep neural
networks, has been widely applied in classification. However, how to achieve
good classification performance with a deep neural network given a limited
amount of labeled data, and how the learned features can further improve
classification, remain unclear. In this paper, we propose the Horizontal
Voting, Vertical Voting and Horizontal Stacked Ensemble methods to improve
the classification performance of deep neural networks. In the ICML 2013
Black Box Challenge, using these methods independently, Bing Xu achieved 3rd
place on the public leaderboard and 7th on the private leaderboard, while
Jingjing Xie achieved 4th place on the public leaderboard and 5th on the
private leaderboard.
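A minimal sketch of the voting step, assuming an sklearn-style
`predict_proba` interface: horizontal voting averages the predictions of
snapshots of one network saved over a contiguous range of training epochs,
and the same averaging applies to other ensemble variants:

    # Sketch only: average class probabilities across model snapshots.
    import numpy as np

    def horizontal_vote(snapshots, x):
        # `snapshots`: networks saved at consecutive epochs of one run.
        probs = np.mean([m.predict_proba(x) for m in snapshots], axis=0)
        return probs.argmax(axis=-1)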
An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation
Generating semantically coherent responses is still a major challenge in
dialogue generation. Different from conventional text generation tasks, the
mapping between inputs and responses in conversations is more complicated,
which highly demands the understanding of utterance-level semantic dependency,
a relation between the whole meanings of inputs and outputs. To address this
problem, we propose an Auto-Encoder Matching (AEM) model to learn such
dependency. The model contains two auto-encoders and one mapping module. The
auto-encoders learn the semantic representations of inputs and responses, and
the mapping module learns to connect the utterance-level representations.
Experimental results from automatic and human evaluations demonstrate that our
model is capable of generating responses of high coherence and fluency compared
to baseline models. The code is available at https://github.com/lancopku/AMM
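A rough sketch of the layout, with sizes and recurrent modules assumed for
illustration (see the linked repository for the actual model): two
auto-encoders learn utterance-level representations, and a mapping module
bridges their latent codes:

    # Sketch in PyTorch: plain GRUs stand in for the auto-encoders; the
    # mapping module connects the input code to the response code.
    import torch.nn as nn

    class AEMSketch(nn.Module):
        def __init__(self, dim=256):
            super().__init__()
            self.post_enc = nn.GRU(dim, dim, batch_first=True)  # input side
            self.resp_dec = nn.GRU(dim, dim, batch_first=True)  # response side
            self.mapping = nn.Linear(dim, dim)                  # utterance bridge

        def forward(self, post_emb, resp_emb):       # embedded token sequences
            _, h_post = self.post_enc(post_emb)      # utterance code of the input
            h_resp = self.mapping(h_post)            # map to the response space
            out, _ = self.resp_dec(resp_emb, h_resp) # teacher-forced decoding
            return out                               # project to vocab logits later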